feat(dwarf): DwarfHandling::Remap end-to-end (#143 DWARF Phase 2 inc 3b) by avrabe · Pull Request #206 · pulseengine/meld

avrabe · 2026-05-29T17:48:03Z

Summary

The final piece of DWARF Phase 2 (#143): DwarfHandling::Remap turns the AddressRemap engine (v0.18.0) into a working, debug-info-preserving fusion path. It reads an input core module's .debug_* sections, translates every code address to the fused code section, and re-serializes a single remapped DWARF set via gimli::write::Dwarf::from. Exposed as meld fuse --dwarf remap.

This completes the increment arc:

Inc	Release	Anchor
1	v0.16.0	per-function base (component-provenance v2 `code_range`)
2	v0.17.0	intra-function `InstrOffsetMap` (LEB drift)
3a	v0.18.0	`AddressRemap` engine (composes 1+2)
3b	this PR	gimli `.debug_*` rewrite + `DwarfHandling::Remap`

Design — three decisions that de-risk it

Post-hoc parallel operator walk. The instruction offset map is recovered by walking the input and final-output operator streams in lockstep, not captured during the merge. The merge re-rewrites bodies after adapter wiring (lib.rs:694), so a captured map would go stale; the post-hoc walk reflects whatever rewriting actually happened and threads no state through the hot path. A per-function operator-count or locals-prefix mismatch (e.g. memory-rebasing inserted scratch ops) aborts the remap.
Correct-or-strip. gimli::write::Dwarf::from is all-or-nothing on addresses — a None from convert_address aborts the whole conversion. That's used as the safety gate: only the structurally-invariant code-section base (address 0) is special-cased; any other unmapped address fails the conversion and falls back to stripping. Never emit a wrong address.
Single-DWARF-source scope. DWARF is per-input-core-module but the output has one code section. Merging N independent DWARF unit sets is a separate fidelity problem — deferred. Exactly one DWARF-bearing source → full remap; more than one → strip + warn; zero → no-op. The common single-component case (main module carries DWARF, hand-written adapter modules don't) is fully served.
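The three decisions above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the code in dwarf.rs: `plan_for_sources`, `build_offset_map`, and `translate_all` are hypothetical names, operators are reduced to their byte offsets, the locals-prefix check is omitted, and mapping address 0 to 0 is an assumption about how the code-section base is special-cased.

```rust
use std::collections::BTreeMap;

/// Hypothetical outcome of the single-DWARF-source scoping rule.
#[derive(Debug, PartialEq)]
enum DwarfPlan {
    NoOp,
    Remap,
    StripWithWarning,
}

/// Zero DWARF-bearing sources: nothing to do. Exactly one: full remap.
/// More than one: strip and warn.
fn plan_for_sources(dwarf_sources: usize) -> DwarfPlan {
    match dwarf_sources {
        0 => DwarfPlan::NoOp,
        1 => DwarfPlan::Remap,
        _ => DwarfPlan::StripWithWarning,
    }
}

/// Post-hoc parallel walk, reduced to operator start offsets.
/// Pairs the i-th input operator with the i-th output operator and
/// returns `None` (abort the remap) when the operator counts differ.
fn build_offset_map(input_ops: &[u64], output_ops: &[u64]) -> Option<BTreeMap<u64, u64>> {
    if input_ops.len() != output_ops.len() {
        return None;
    }
    Some(
        input_ops
            .iter()
            .copied()
            .zip(output_ops.iter().copied())
            .collect(),
    )
}

/// Correct-or-strip: either every address translates or the whole
/// conversion fails. Address 0 is passed through unchanged here
/// (an assumption made for this sketch).
fn translate_all(addrs: &[u64], map: &BTreeMap<u64, u64>) -> Option<Vec<u64>> {
    addrs
        .iter()
        .map(|&a| if a == 0 { Some(0) } else { map.get(&a).copied() })
        .collect()
}
```

With input offsets `[10, 12, 15]` and output offsets `[110, 113, 117]`, address 12 translates to 113, while any address missing from the map makes `translate_all` return `None`. That is the point at which the real path would fall back to stripping rather than emit a guess.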

A three-pass encode keeps the remapped .debug_* inside the attestation/provenance-hashed bytes (trailing custom sections don't shift code offsets, so the remap built from pass A is valid for the final output).
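The parenthetical claim, that trailing custom sections do not shift code offsets, follows from the wasm binary layout: each section is an id byte, a LEB128 size, and a payload, laid out back to back. A small sketch of that property, assuming hypothetical helpers `leb128`, `push_section`, and `custom_payload` (none of these are meld APIs):

```rust
/// Encode an unsigned integer as LEB128, the encoding wasm uses for sizes.
fn leb128(mut v: u32) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Append a section (id, LEB128 size, payload) and return the offset
/// at which its payload starts within the module bytes.
fn push_section(module: &mut Vec<u8>, id: u8, payload: &[u8]) -> usize {
    module.push(id);
    module.extend(leb128(payload.len() as u32));
    let start = module.len();
    module.extend_from_slice(payload);
    start
}

/// Build a custom-section payload: LEB128 name length, name bytes, data.
fn custom_payload(name: &str, data: &[u8]) -> Vec<u8> {
    let mut out = leb128(name.len() as u32);
    out.extend_from_slice(name.as_bytes());
    out.extend_from_slice(data);
    out
}
```

Pushing a code section (id 10) after the 8-byte wasm header and then appending a custom section (id 0) named `.debug_line` leaves every byte before the custom section untouched, so offsets computed against the code section in an earlier pass remain valid.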

Verification

LS-D-1 (new, approved): wrong remapped DWARF address → de-grounded downstream coverage/breakpoints. Gated by dwarf::tests::ls_d_1_remap_translates_low_pc — a full gimli read→convert→write→read oracle that builds real input DWARF, remaps a subprogram's low_pc, re-parses the output DWARF, and asserts the address was actually translated.
Parallel-walk unit tests: identity round-trip from real wasm bytes + operator-count-mismatch abort.
Multi-source strip-fallback integration test on lists.wasm.
The six translate_* unit tests (3a) pin the remap math.
tools/run_ls_verification.py: [ OK ] LS-D-1 (1 pass).

Residual (documented in LS-D-1): DW_AT_high_pc encoded as a length (DW_FORM_data*, the common Rust/LLVM encoding) is copied verbatim — gimli treats it as a constant, not an address — so a function's reported byte length may be off by intra-function LEB drift. low_pc and the line-number program (what debuggers and pulseengine/witness use) are correct.
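To make the size of that residual concrete, here is a small sketch of how LEB128 re-encoding can change a function body's length. It assumes, for illustration only, that the drift comes from index immediates being renumbered during fusion; `leb128_len` and `length_drift` are hypothetical helpers, not part of meld.

```rust
/// Number of bytes an unsigned LEB128 encoding of `v` occupies.
fn leb128_len(mut v: u32) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

/// Net change in body length (in bytes) when each immediate in
/// `old_indices` is rewritten to the matching entry in `new_indices`.
/// A positive result means the fused body is longer than the input body.
fn length_drift(old_indices: &[u32], new_indices: &[u32]) -> i64 {
    old_indices
        .iter()
        .zip(new_indices)
        .map(|(&old, &new)| leb128_len(new) as i64 - leb128_len(old) as i64)
        .sum()
}
```

For example, rewriting `call 5` to `call 300` grows the immediate from one byte to two. A `DW_AT_high_pc` length copied verbatim would then under-report that function by one byte, while `low_pc` and the line program are still translated through the offset map.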

Notes

Adds the gimli dependency.
dwarf.rs Tier-5 registration is a separate follow-up PR (the claude-code-action byte-identical-workflow constraint forbids bundling a mythos-auto.yml edit with code changes).

🤖 Generated with Claude Code

Wire the v0.18.0 AddressRemap engine into a real `.debug_*` rewrite. `DwarfHandling::Remap` reads an input core module's DWARF, translates every code address to the fused code section, and re-serializes a single remapped set via `gimli::write::Dwarf::from`. Exposed as `meld fuse --dwarf remap`.

Design (de-risked):
- Recover the per-function instruction offset map POST-HOC by walking the input and final-output operator streams in lockstep, rather than capturing during the merge, so it reflects the adapter-wiring re-rewrite and threads no state through the hot path. A per-function operator-count / locals-prefix mismatch aborts the remap.
- Correct-or-strip: `gimli::write::Dwarf::from` is all-or-nothing on addresses, used as the safety gate. Only the code-section base (address 0) is special-cased; any other unmapped address fails the conversion and falls back to stripping, never a wrong address.
- Single DWARF source supported; multi-source inputs strip with a warning (merging independent unit sets deferred). Zero sources is a no-op.
- Three-pass encode so remapped `.debug_*` sections land in the attestation/provenance-hashed bytes (trailing custom sections don't shift code offsets).

Verification:
- New LS-D-1 (approved): wrong remapped address -> de-grounded downstream coverage/breakpoints. Gated by `dwarf::tests::ls_d_1_remap_translates_low_pc`, a full gimli read->convert->write->read oracle asserting low_pc is actually translated. Plus the parallel-walk unit tests (identity + abort) and the multi-source strip-fallback integration test on lists.wasm.
- Residual: `DW_AT_high_pc` as a length is copied verbatim (may be off by intra-function LEB drift); low_pc + line program are correct.

Adds the `gimli` dependency. dwarf.rs Tier-5 registration follows in a separate workflow PR (byte-identical-workflow constraint).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-29T17:54:09Z

LS-N verification gate

⚠️ 36/38 verified — 2 missing regression tests

	count
Passed (≥1 test, all green)	36
Failed (≥1 test failure)	0
Missing (no `ls_<letter>_<num>_*` test found)	2

Approved loss-scenarios.yaml entries are expected to have a regression test named ls_<letter>_<num>_* (e.g. LS-A-11 → ls_a_11_*). The gate runs each prefix via cargo test --lib --no-fail-fast and aggregates pass/fail/missing.
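The naming convention is mechanical enough to sketch. The gate itself lives in tools/run_ls_verification.py; the Rust function below is only an illustration of the ID-to-prefix mapping, and `test_prefix` is a hypothetical name.

```rust
/// Map a loss-scenario ID such as "LS-A-11" to the expected regression-test
/// name prefix "ls_a_11_". Returns `None` for IDs that do not match the
/// `LS-<letter>-<num>` shape.
fn test_prefix(id: &str) -> Option<String> {
    let mut parts = id.split('-');
    let (tag, letter, num) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || tag != "LS" {
        return None;
    }
    let letter_ok = !letter.is_empty() && letter.chars().all(|c| c.is_ascii_alphabetic());
    let num_ok = !num.is_empty() && num.chars().all(|c| c.is_ascii_digit());
    if !(letter_ok && num_ok) {
        return None;
    }
    Some(format!("ls_{}_{}_", letter.to_ascii_lowercase(), num))
}
```

Under this mapping, LS-D-1 corresponds to the prefix `ls_d_1_`, which matches the `ls_d_1_remap_translates_low_pc` test added in this PR.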

Failed LS entries

(none)

Missing regression tests

LS-R-13
LS-M-6

Updated automatically by tools/post_verification_comment.py. Source of truth: safety/stpa/loss-scenarios.yaml.

`meld-core/src/dwarf.rs` (the DWARF AddressRemap engine + the `DwarfHandling::Remap` rewrite, #143) is correctness-critical: a wrong remapped code address silently de-grounds downstream coverage and breakpoints (LS-D-1). Add it to the Mythos auto-scan Tier-5 file list so future diffs get the clean-room AI delta pass.

Standalone workflow-only change: the claude-code-action identity check requires the workflow file be byte-identical to main, so this cannot be bundled with the inc 3b code PR (#206).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

avrabe mentioned this pull request May 29, 2026

chore(mythos): register dwarf.rs as Tier-5 #207

Merged

avrabe merged commit b028e43 into main May 29, 2026
14 checks passed

avrabe deleted the feat/dwarf-remap-inc3b branch May 29, 2026 18:19

avrabe merged 1 commit into main from feat/dwarf-remap-inc3b
